Block import improvements #10373

Merged
lrubasze merged 96 commits into master from lrubasze/block-import-improvements
Feb 13, 2026

Conversation

@lrubasze lrubasze commented Nov 20, 2025

This PR fixes block import during Warp sync, which was silently failing with "Unknown parent" errors. Such errors are typical during Warp sync, and the `full_node_warp_sync` test did not detect the failure.

Changes

  • Relaxed verification for Warp synced blocks:
    The fix relaxes verification requirements for Warp synced blocks by not performing full verification, with the assumption that these blocks are part of the finalized chain and have already been verified using the provided warp sync proof.
  • New BlockOrigin variants:
    For improved clarity, two additional BlockOrigin items have been introduced:
    • WarpSync
    • GapSync
  • Gap sync improvements:
    Warp synced blocks are now skipped during the gap sync block import phase, which required improvements to gap handling when committing the block import operation in the database.
  • Enhanced testing:
    The Warp sync zombienet test has been modified to more thoroughly assert both warp and gap sync phases.
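
As a rough illustration of the first two changes, the sketch below shows how a verifier might pick a verification level from the block origin. Only the `WarpSync` and `GapSync` variant names come from the description above; the other variants, the `Verification` enum, and `verification_for` are hypothetical and simplified, and this is not the actual SDK code. The description does not say how `GapSync` blocks are verified, so the sketch treats them like any other origin purely as an assumption.

```rust
/// Simplified stand-in for the block origin enum. Only `WarpSync` and
/// `GapSync` are named in the PR description; the rest are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BlockOrigin {
    NetworkInitialSync,
    NetworkBroadcast,
    Own,
    WarpSync,
    GapSync,
}

/// Hypothetical verification levels used only in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum Verification {
    Full,
    Relaxed,
}

/// Chooses a verification level for a block based on where it came from.
/// Warp synced blocks are assumed to be covered by the warp sync proof,
/// so they skip full verification. Everything else, including `GapSync`
/// (an assumption), is fully verified here.
pub fn verification_for(origin: BlockOrigin) -> Verification {
    match origin {
        BlockOrigin::WarpSync => Verification::Relaxed,
        _ => Verification::Full,
    }
}
```

Keeping the decision in one `match` on the origin makes the relaxed path explicit and easy to assert in tests.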

This PR builds on changes by @sistemd in #9678

@lrubasze lrubasze changed the base branch from sistemd/pruned-gap-sync-consensus-broadcast to master November 25, 2025 13:10

lrubasze commented Feb 4, 2026

/cmd fmt

Comment thread substrate/client/db/src/lib.rs Outdated
@lrubasze lrubasze added this pull request to the merge queue Feb 13, 2026
Merged via the queue into master with commit b2a4296 Feb 13, 2026
243 of 245 checks passed
@lrubasze lrubasze deleted the lrubasze/block-import-improvements branch February 13, 2026 10:17
lrubasze added a commit that referenced this pull request Mar 4, 2026
Co-authored-by: sistemd <enntheprogrammer@gmail.com>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@lrubasze lrubasze added the A4-backport-stable2603 (Pull request must be backported to the stable2603 release branch) label Mar 4, 2026
paritytech-release-backport-bot Bot pushed a commit that referenced this pull request Mar 4, 2026
Co-authored-by: sistemd <enntheprogrammer@gmail.com>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
(cherry picked from commit b2a4296)
@paritytech-release-backport-bot
Successfully created backport PR for stable2603:

lrubasze added a commit that referenced this pull request Mar 6, 2026
Backport #10373 into `stable2603` from lrubasze.

This PR is needed for #10752, which significantly reduces DB and
network usage when Gap sync is used (bodies are no longer
requested). It also shortens gap sync duration.

See the
[documentation](https://github.com/paritytech/polkadot-sdk/blob/master/docs/BACKPORT.md)
on how to use this bot.


---------

Co-authored-by: Lukasz Rubaszewski <117115317+lrubasze@users.noreply.github.com>
Co-authored-by: sistemd <enntheprogrammer@gmail.com>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sebastian Kunert <skunert49@gmail.com>
lexnv added a commit that referenced this pull request Mar 6, 2026
This reverts commit b2a4296.

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
github-merge-queue Bot pushed a commit that referenced this pull request Mar 12, 2026
… state (#11330)

This PR skips block execution when a block is passed to the import
queue with `StateAction::Skip`.

A bug in the import queue affects collators: non-archive collators
should not execute blocks that are imported as part of Gap Sync.

The bug surfaced when `import_existing` was changed from `false` to
`true` in:
- #10373

### Issue

The issue manifests for collators that have an unfilled block gap in
their DB.

On restart with #10373 applied, a collator goes through the following steps:
- client info detects a gap at block 5800 with length 1
- collator [X] requests block 5800 with `fields: HEADER | BODY |
JUSTIFICATION, from: Number(5800)`
- the other 2 collators respond with the full block, including the body,
because by default collators keep the canonical chain but discard the
block state
- collator [X] tries to import the block because `import_existing` is
true, so execution continues past the following check:

https://github.com/paritytech/polkadot-sdk/blob/2b9576c163b1c2408291e2b6c98ae0f2465b4818/substrate/client/service/src/client/client.rs#L1809-L1812

- Before the changes, the code returned
`Ok(ImportResult::AlreadyInChain)` here, which short-circuited the
import of the block

- collator [X] attempts the import, which fails with `State already
discarded`
- the error is propagated back to the sync engine, which restarts the
sync process with the same block gap (`Restarting sync with
client ...`)
- this results in a vicious cycle in which collator [X] requests the
same block again, then restarts the sync engine
- eventually, at the third request, the other collators treat this
behavior as malicious and ban and disconnect the peer
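
The cycle described above can be sketched as a toy model. Everything here is hypothetical and heavily simplified: `import_known_block`, `requests_until_banned`, the two-variant `ImportResult`, and the `ban_after` threshold are illustration-only names, not the client or sync engine API. The model assumes the block is already in the chain and that its parent state has been pruned.

```rust
/// Two-variant stand-in for the import result, for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub enum ImportResult {
    AlreadyInChain,
    Imported,
}

/// Toy import path for a block that is already in the chain.
/// With `import_existing == false` the import short-circuits; with `true`
/// the block is executed, which fails if the parent state was discarded.
pub fn import_known_block(
    import_existing: bool,
    parent_state_available: bool,
) -> Result<ImportResult, String> {
    if !import_existing {
        // Old behaviour: return before any execution is attempted.
        return Ok(ImportResult::AlreadyInChain);
    }
    if !parent_state_available {
        return Err("State already discarded".to_string());
    }
    Ok(ImportResult::Imported)
}

/// Counts repeated requests for the same gap block until peers would ban
/// the requester. Returns `None` if the import does not loop at all.
pub fn requests_until_banned(import_existing: bool, ban_after: u32) -> Option<u32> {
    let mut requests = 0;
    loop {
        requests += 1;
        match import_known_block(import_existing, false) {
            Ok(_) => return None,
            Err(_) if requests >= ban_after => return Some(requests),
            // Sync restarts with the same gap and requests the block again.
            Err(_) => continue,
        }
    }
}
```

In this model, `requests_until_banned(true, 3)` reaches the ban threshold, while `requests_until_banned(false, 3)` never enters the loop because the import short-circuits.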

### Fix

The fix is to skip executing blocks when the gap sync has marked blocks
as `StateAction::Skip`.
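
A minimal sketch of the idea behind the fix, assuming a simplified two-variant `StateAction` and a hypothetical `import_block` function (the real enum and import path in the SDK are more involved): when the action is `Skip`, the block is stored without execution, so missing parent state no longer causes an error.

```rust
/// Simplified stand-in for the state action attached to an imported block.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StateAction {
    Execute,
    Skip,
}

/// Hypothetical outcome type used only in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Executed,
    StoredWithoutExecution,
}

/// Imports a block, executing it only when the state action asks for it.
pub fn import_block(
    action: StateAction,
    parent_state_available: bool,
) -> Result<Outcome, String> {
    match action {
        // Gap sync marks blocks as `Skip`: never touch the parent state.
        StateAction::Skip => Ok(Outcome::StoredWithoutExecution),
        StateAction::Execute if parent_state_available => Ok(Outcome::Executed),
        // Executing on top of pruned state is what produced the error.
        StateAction::Execute => Err("State already discarded".to_string()),
    }
}
```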

Please note that the following issues remain and should be addressed
in a different PR:
- The block gap was never closed in the database
- When the node starts with a block gap, it always initiates a
block request over the sync protocol to close the gap
- Previously, gap blocks were imported with `import_existing: false`,
which short-circuited the import and returned `AlreadyInChain`
- Effectively, nodes re-requested the gap on every reboot, wasting
network bandwidth to close the gap "in memory" only; the result was
never committed to the DB
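
The remaining problem can be illustrated with a toy model that separates the persisted gap from the in-memory one. The `BlockGap` field names follow the log output in this commit message; `Node`, `gap_after_import`, the `commit` flag, and the assumption that a gap shrinks from its `start` are hypothetical and not taken from the SDK.

```rust
/// Inclusive range of blocks that are still missing.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockGap {
    pub start: u64,
    pub end: u64,
}

/// Returns the gap left after importing `number`, or `None` once closed.
/// Assumes the gap is filled from `start` upwards.
pub fn gap_after_import(gap: BlockGap, number: u64) -> Option<BlockGap> {
    if number != gap.start {
        return Some(gap);
    }
    if gap.start >= gap.end {
        None
    } else {
        Some(BlockGap { start: gap.start + 1, end: gap.end })
    }
}

/// Toy node holding the gap both on disk and in memory.
pub struct Node {
    pub persisted_gap: Option<BlockGap>,
    pub in_memory_gap: Option<BlockGap>,
}

impl Node {
    /// Starting a node loads the gap from the database.
    pub fn start(persisted_gap: Option<BlockGap>) -> Self {
        Node { persisted_gap, in_memory_gap: persisted_gap }
    }

    /// Imports a gap block; the database changes only if `commit` is set.
    pub fn import(&mut self, number: u64, commit: bool) {
        self.in_memory_gap = self.in_memory_gap.and_then(|g| gap_after_import(g, number));
        if commit {
            self.persisted_gap = self.in_memory_gap;
        }
    }

    /// A restart discards in-memory state and reloads the persisted gap.
    pub fn restart(self) -> Self {
        Node::start(self.persisted_gap)
    }
}
```

With `commit` set to `false`, the gap reappears after `restart()`, which corresponds to the repeated block request on every reboot described above.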


### Full Logs

```text
2026-03-10 13:43:41.138 DEBUG                 main sync: [Parachain] Restarting sync with client info Info { best_hash: 0xcb03c2aa7dd61f84b27d4c7db42ab848d2eaee9da77ddedc827e070ece063102, best_number: 5883392, genesis_hash: 0x8692fdabb7e55c3347c0f887343e3c0f3fbb560c5f52c9cdc1f7660a1f183c5d, finalized_hash: 0x43664710059a72b37c11db9f99a0f38323b478fbdc82afac058c530c7b002e4d, finalized_number: 5883372, finalized_state: Some((0x43664710059a72b37c11db9f99a0f38323b478fbdc82afac058c530c7b002e4d, 5883372)), number_leaves: 1,
	block_gap: Some(BlockGap { start: 5800, end: 5800, gap_type: MissingBody }) }
2026-03-10 13:43:41.138 DEBUG                 main sync: [Parachain] Starting gap sync #5800 - #5800 (old gap best and target: None)    
2026-03-10 13:43:41.138 TRACE                 main sync: [Parachain] Restarted sync at #5883392 (0xcb03c2aa7dd61f84b27d4c7db42ab848d2eaee9da77ddedc827e070ece063102)


2026-03-10 13:45:17.775 TRACE tokio-runtime-worker sync: [Parachain] New gap block request for 12D3KooWRejf1JYYjaaKhHAn28VJJR9ryZqs3wiGPsVjk6eFLLrn, (best:5883362, common:5883362)
	BlockRequest { id: 0, fields: HEADER | BODY | JUSTIFICATION, from: Number(5800), direction: Descending, max: Some(1) } 

2026-03-10 13:45:17.784 DEBUG tokio-runtime-worker sync::import-queue: [Parachain] Starting import of 1 blocks  (5800) (origin: GapSync)    
2026-03-10 13:45:17.784 TRACE tokio-runtime-worker sync::import-queue: [Parachain] Block 5800 (0x26dc…1cda) has 4 logs (origin: GapSync)    
2026-03-10 13:45:17.792 DEBUG tokio-runtime-worker sync::import-queue: [Parachain] Error importing block 5800: 0x26dca166cfefe439262d201b10a8d2679edc4bd98ae59fe12d7f7eef9b871cda:
	Api called for an unknown Block: State already discarded for 0x4739cf07649d6383bb19d2adccbe9d3f5b1ed91ef5fd6530bc8e69e560b5be91

2026-03-10 13:45:17.792  WARN tokio-runtime-worker sync: [Parachain] 💔 Error importing block 0x26dca166cfefe439262d201b10a8d2679edc4bd98ae59fe12d7f7eef9b871cda: consensus error: Api called for an unknown Block: State already discarded for 0x4739cf07649d6383bb19d2adccbe9d3f5b1ed91ef5fd6530bc8e69e560b5be91    
2026-03-10 13:45:17.792 DEBUG tokio-runtime-worker sync: [Parachain] Restarting sync with client info Info { best_hash: 0xcb03c2aa7dd61f84b27d4c7db42ab848d2eaee9da77ddedc827e070ece063102, best_number: 5883392, genesis_hash: 0x8692fdabb7e55c3347c0f887343e3c0f3fbb560c5f52c9cdc1f7660a1f183c5d, finalized_hash: 0xcb03c2aa7dd61f84b27d4c7db42ab848d2eaee9da77ddedc827e070ece063102, finalized_number: 5883392, finalized_state: Some((0xcb03c2aa7dd61f84b27d4c7db42ab848d2eaee9da77ddedc827e070ece063102, 5883392)), number_leaves: 1,
		block_gap: Some(BlockGap { start: 5800, end: 5800, gap_type: MissingBody }) }

	
2026-03-10 13:45:17.792 DEBUG tokio-runtime-worker sync: [Parachain] Starting gap sync #5800 - #5800 (old gap best and target: Some((5800, 5800)))    
2026-03-10 13:45:17.792 TRACE tokio-runtime-worker sync: [Parachain] Restarted sync at #5883392 (0xcb03c2aa7dd61f84b27d4c7db42ab848d2eaee9da77ddedc827e070ece063102)    
```

### Testing Done

- unblocks kusama yap 3392:
https://grafana.teleport.parity.io/goto/KBKfuhKDR?orgId=1
- left side of the graph is origin/master, right side is the patch
applied with connected peers


Closes:
- #11299

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
paritytech-release-backport-bot Bot pushed a commit that referenced this pull request Mar 12, 2026
(cherry picked from commit 3c93291)

Labels

- A4-backport-stable2603: Pull request must be backported to the stable2603 release branch
- A5-run-CI: Run CI on draft PR
- T0-node: This PR/Issue is related to the topic “node”.

Projects

Status: Done

5 participants